Quasi-stationary distributions as centrality measures of reducible graphs
A random walk can be used as a centrality measure of a directed graph. However,
if the graph is reducible, the random walk is absorbed in some subset of
nodes and never visits the rest of the graph. In Google PageRank this
problem was solved by introducing uniform random jumps with some
probability. Up to the present, there is no clear criterion for the choice of this
parameter. We propose a parameter-free centrality measure based
on the notion of a quasi-stationary distribution. Specifically, we suggest four
quasi-stationary based centrality measures, analyze them, and conclude that they
produce approximately the same ranking. The new centrality measures can be
applied in spam detection to detect "link farms" and in image search to find
photo albums.
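A minimal sketch of the underlying idea (not the paper's four specific measures): restrict the transition matrix of a reducible chain to its transient states, giving a substochastic matrix Q; the quasi-stationary distribution is the normalized left Perron eigenvector of Q, and its entries can rank the transient nodes. The 3-node chain below is an invented example.

```python
import numpy as np

# Transient part Q of a reducible chain (rows sum to < 1: the missing
# mass is the probability of being absorbed). Illustrative values only.
Q = np.array([
    [0.0, 0.5, 0.4],
    [0.3, 0.0, 0.6],
    [0.5, 0.4, 0.0],
])

# Left eigenvector of Q = eigenvector of Q.T; take the Perron root
# (largest eigenvalue), whose eigenvector is positive up to sign.
vals, vecs = np.linalg.eig(Q.T)
k = int(np.argmax(vals.real))
v = np.abs(vecs[:, k].real)
qsd = v / v.sum()  # normalize to a probability distribution

print(qsd)  # quasi-stationary mass of each transient node, usable as a ranking
```

The eigenvector is defined only up to scale and sign, hence the `np.abs` and normalization before interpreting the entries as a distribution.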
Tensor approach to mixed high-order moments of absorbing Markov chains
Moments of absorbing Markov chains are considered. First moments and non-mixed second moments are derived in classical textbooks such as "Finite Markov Chains" by J. Kemeny and J. Snell, because these moments can be easily expressed in matrix form. Since the representation of mixed higher-order moments in matrix form is not straightforward, if possible at all, they had not previously been calculated. This paper fills that gap: a tensor approach to the mixed high-order moments is proposed, and compact closed-form expressions for the moments are derived.
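For context, the classical first moments mentioned above come from the fundamental matrix: if Q is the transient part of the transition matrix, then N = (I - Q)^{-1} holds the expected visit counts and N·1 the expected absorption times. A toy two-state example (the values are invented, not from the paper):

```python
import numpy as np

# Transient part Q of a small absorbing chain (illustrative values).
Q = np.array([[0.0, 0.5],
              [0.4, 0.0]])

# Fundamental matrix: N[i, j] = expected number of visits to transient
# state j before absorption, starting from transient state i.
N = np.linalg.inv(np.eye(2) - Q)

# First moments of the absorption time: t[i] = expected steps from i.
t = N @ np.ones(2)  # -> absorption times 1.875 and 1.75
```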
Monte Carlo Methods for Top-k Personalized PageRank Lists and Name Disambiguation
We study the problem of quick detection of top-k Personalized PageRank lists.
This problem has a number of important applications, such as finding local cuts
in large graphs, estimating similarity distance, and name disambiguation. In
particular, we apply our results to construct efficient algorithms for the
person name disambiguation problem. We argue that two observations are
important when finding top-k Personalized PageRank lists. Firstly, it is
crucial to quickly detect the top-k most important neighbours of a node,
while the exact order within the top-k list, as well as the exact values of
PageRank, are far less crucial. Secondly, a small number of wrong elements in a
top-k list does not really degrade its quality, but tolerating them can lead to
significant computational savings. Based on these two key observations we
propose Monte Carlo methods for fast detection of top-k Personalized PageRank
lists. We provide a performance evaluation of the proposed methods and supply
stopping criteria. We then apply the methods to the person name disambiguation
problem. The developed algorithm for the person name disambiguation problem
achieved second place in the WePS 2010 competition.
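A hedged sketch of the Monte Carlo idea (the paper's actual estimators and stopping criteria are more refined): simulate many random walks from the personalization node, terminating each step with probability 1 - alpha; visit counts approximate Personalized PageRank up to normalization, and the most-visited nodes form a top-k candidate list. The graph, function name, and parameters below are invented for illustration.

```python
import random
from collections import Counter

def mc_top_k(adj, seed, k, alpha=0.85, n_walks=10000, rng=None):
    """Monte Carlo estimate of the top-k Personalized PageRank nodes.

    adj: dict node -> list of out-neighbours.  Each walk starts at `seed`,
    continues with probability alpha, and restarts (ends) otherwise;
    dangling nodes also end the walk.  Visit counts approximate PPR up
    to normalization, which is enough to rank the top-k nodes.
    """
    rng = rng or random.Random(0)
    visits = Counter()
    for _ in range(n_walks):
        node = seed
        while True:
            visits[node] += 1
            if rng.random() > alpha or not adj.get(node):
                break
            node = rng.choice(adj[node])
    return [n for n, _ in visits.most_common(k)]

adj = {0: [1, 2], 1: [2], 2: [0], 3: [0]}
print(mc_top_k(adj, seed=0, k=2))  # seed node 0 ranks first in its own PPR list
```

This mirrors the two observations above: visit counts converge to a stable top-k ranking long before the PPR values themselves are accurate.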
Approaches to PageRank computation based on Monte Carlo methods and Markov chains
Pagerank based clustering of hypertext document collections
Clustering hypertext document collections is an important task in Information Retrieval. Most clustering methods are based on document content and do not take the hypertext links into account. Here we propose a novel PageRank-based clustering (PRC) algorithm which uses the hypertext structure. The PRC algorithm produces graph partitionings with high modularity and coverage. A comparison of the PRC algorithm with two content-based clustering algorithms shows a good match between PRC clustering and content-based clustering.
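The abstract evaluates partitions by modularity; as a minimal illustration (this is the standard Newman modularity, not the PRC algorithm itself), modularity can be computed directly from its definition. The two-triangle graph below is an invented example.

```python
def modularity(adj, communities):
    """Newman modularity of a partition of an undirected graph.

    adj: dict node -> list of neighbours (each edge listed in both directions).
    communities: iterable of node collections covering all nodes.
    """
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # number of edges
    q = 0.0
    for comm in communities:
        comm = set(comm)
        # Edges with both endpoints inside the community (counted twice, so halve).
        internal = sum(1 for u in comm for v in adj[u] if v in comm) / 2
        degree = sum(len(adj[u]) for u in comm)  # total degree of the community
        q += internal / m - (degree / (2 * m)) ** 2
    return q

# Two triangles joined by a single bridge edge (2-3).
adj = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
print(modularity(adj, [{0, 1, 2}, {3, 4, 5}]))  # natural split scores ~0.357
```

Splitting along the bridge scores well above the trivial one-community partition (modularity 0), which is the sense in which PRC's partitions are reported as "high modularity".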